Distributed Diffusion
https://gyazo.com/ca0dc598801d5c2adeeceb9559b38bef
Basically, each peer gets a small chunk of the dataset and trains on it locally. Once all peers worldwide have together reached a target number of steps, they synchronize and share gradients (learning data). Under good conditions this synchronization round takes about 5 minutes, and then the process repeats.
> This process should scale almost linearly, depending mostly on the reach of the DHT network.
> It can be run by anyone with two computers, two GPUs, one large drive, and good consumer-grade bandwidth (70 Mbps+).
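The peer loop above can be pictured roughly as follows. This is a minimal sketch assuming a hivemind-style setup (a `hivemind.DHT` plus `hivemind.Optimizer`, which matches the DHT-based synchronization described above); the `run_id`, batch sizes, and toy model are placeholders, not the project's actual configuration.

```python
# Minimal sketch of one peer: train locally on a dataset chunk, and let the
# collaborative optimizer synchronize with other peers once the shared
# target step/sample count is reached. Assumes the hivemind library.
import torch
import hivemind

model = torch.nn.Linear(16, 16)          # stand-in for the diffusion model
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

dht = hivemind.DHT(start=True)           # start/join the DHT (real peers would pass initial_peers)

collab_opt = hivemind.Optimizer(
    dht=dht,
    run_id="distributed-diffusion",      # assumed experiment name
    optimizer=opt,
    batch_size_per_step=4,               # samples this peer contributes per local step
    target_batch_size=4096,              # global total before a synchronization round
    use_local_updates=True,              # keep training locally between sync rounds
    verbose=True,
)

for step in range(1000):                 # each peer iterates over its own chunk
    x = torch.randn(4, 16)
    loss = model(x).pow(2).mean()        # placeholder loss
    loss.backward()
    collab_opt.step()                    # counts toward the shared target; syncs when it is reached
    collab_opt.zero_grad()
```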